The general structure of hypothesis testing statistics
This section is not a comprehensive guide to statistics. It is
intended to remind you of the general format of hypothesis testing
statistics. It does not tell you how to do any actual tests. For
examples of tests, go to the appropriate section.
Even if you think you know nothing about statistics, it's almost certain that you do. You will probably have heard of terms like the "average" value, or maybe the "range" of some data, or possibly its "standard deviation" or "variance". All of these things tell you something about a set of data; they are known as descriptive statistics. The statistics we are concerned with here are called hypothesis-testing statistics. For the most part you will be using them to compare one set of data with another set of data.
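As a reminder of what those descriptive statistics look like in practice, here is a minimal sketch using Python's standard `statistics` module. The heights are invented purely for illustration and the variable names are our own.

```python
import statistics

# Hypothetical heights in centimetres, invented purely for illustration.
heights_cm = [162.0, 171.5, 158.0, 180.0, 175.5, 168.0]

average = statistics.mean(heights_cm)           # the "average" (mean) value
data_range = max(heights_cm) - min(heights_cm)  # the "range": largest minus smallest
variance = statistics.variance(heights_cm)      # sample variance (divides by n - 1)
std_dev = statistics.stdev(heights_cm)          # sample standard deviation

print(f"mean = {average:.1f} cm")
print(f"range = {data_range:.1f} cm")
print(f"variance = {variance:.1f} cm^2")
print(f"standard deviation = {std_dev:.1f} cm")
```

Each of these numbers describes a single set of data. None of them, on its own, tells you whether two sets of data differ by more than chance would explain, which is what the hypothesis-testing statistics are for.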
The format you follow is similar for most of these tests:
Invent a null hypothesis
Using your devastating powers of observation, you have noticed
that adult humans seem on the whole to be taller than baby ones.
You could easily investigate this by measuring some of each kind
and comparing the average values of the two sets of data.
An hypothesis is simply a statement which offers an explanation
of your observations. In this case our experimental hypothesis
might be that all adult humans have had special cosmetic surgery
to lengthen their legs and make them taller than babies. Alternatively,
we might suggest that adults have been around for longer and therefore
have grown bigger. Both of these would be experimental hypotheses,
the latter being the more reasonable one.
A null hypothesis is a special sort of hypothesis which you invent
purely for the purpose of doing the statistical test. It does
not have to agree with your experimental hypothesis. The word
null means a condition of nothingness or lacking any distinction.
A null hypothesis is sometimes called an hypothesis of no difference.
It is always stated as though there were no difference between
the two things you are comparing. If we were doing a test that
compared our two average (or mean) heights a suitable null hypothesis
would be:
There is no significant difference between the means of the two
sets of data
Remember, it might be obvious that there is a difference but you
state it like this anyway. Having done the statistical test you
will end up either accepting or rejecting this statement.
Calculate the value of the test statistic
All the tests do something different but the general pattern of
what you do is the same. The next thing you do is use your data
to calculate a value of the test statistic you are using (this
will have a name, usually a letter like "t", "U" or "rs").
The value you calculate is specific to your data.
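To make this step concrete, here is a sketch of calculating one such test statistic, Spearman's rank correlation coefficient (rs). It assumes the standard formula rs = 1 - 6Σd² / (n(n² - 1)), which applies when there are no tied values. The function names and the data are our own inventions for illustration, not part of any particular test described elsewhere.

```python
def ranks(values):
    """Return the rank (1 = smallest) of each value; assumes no tied values."""
    order = sorted(values)
    return [order.index(v) + 1 for v in values]


def spearman_rs(xs, ys):
    """Spearman's rank correlation coefficient for untied paired data."""
    if len(xs) != len(ys):
        raise ValueError("xs and ys must contain the same number of items")
    n = len(xs)
    # d is the difference between the two ranks within each pair.
    sum_d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(xs), ranks(ys)))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


# Hypothetical data: 10 pairs of measurements, invented for illustration.
distance_m = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0, 4.5, 5.0]
species_count = [12, 20, 18, 27, 25, 33, 30, 40, 38, 42]

rs = spearman_rs(distance_m, species_count)
print(f"rs = {rs:.3f}")
```

The single number that comes out is the value that is specific to your data. It is this number that you go on to compare with a critical value.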
Find the critical value of the test statistic
Statisticians are very clever (except for Heronimous Bing of Oxford,
who is thick). They have spent a long time working out what are
known as critical values of test statistics for all combinations
of circumstances and sets of data. You must extract from one of
their tables of critical values the value that applies to your
combination of circumstances. What the value is depends on the
number of items of data in each data set and the degree of precision
you want to use in either accepting or rejecting your null hypothesis.
This is the real value of these techniques: they allow you to
say how certain you are when you either accept or reject the null
hypothesis. You get to choose how certain you want to be.
Here is part of a table of critical values for a statistic called Spearman's rank correlation coefficient:
| No. of pairs of data (n) | 5% significance level | 1% significance level |
|---|---|---|
| ... | ... | ... |
| 10 | 0.648 | 0.794 |
| ... | ... | ... |
This is where you get to pick the degree of precision you want
in either accepting or rejecting your null hypothesis. Let us
say that you wanted to be as certain as you could be (using our
table above) that you would be correct in accepting or rejecting
your null hypothesis. Enter the table at the 1% significance column
and find the appropriate critical value by going along the appropriate
row. For the sake of argument let's say we have 10 pairs of data.
As you can see the critical value is 0.794.
With this particular test, if the value you've calculated for your own data is the same as or bigger than this, you reject the null hypothesis. If the value for your data is smaller than the critical value, you accept the null hypothesis. In rejecting it at the 1% significance level you are saying: "If the null hypothesis were true and I did this test a very large number of times, I would expect a value this big due to chance only 1% of the time". Put simply (and not quite accurately but hopefully you know what I mean): "I'm 99% certain that I'm right in accepting or rejecting my null hypothesis".
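The look-up and the decision rule can be sketched in a few lines. The table below holds only the two critical values quoted in the text for 10 pairs of data; the names `CRITICAL_VALUES` and `decide`, and the calculated value of 0.70, are hypothetical and chosen just for illustration.

```python
# Only the two critical values quoted in the text (for 10 pairs of data) are
# included here; a real table has a row for every sample size.
CRITICAL_VALUES = {
    10: {0.05: 0.648, 0.01: 0.794},
}


def decide(calculated, n_pairs, significance):
    """Return 'reject' or 'accept' for the null hypothesis."""
    critical = CRITICAL_VALUES[n_pairs][significance]
    # Same as or bigger than the critical value: reject the null hypothesis.
    return "reject" if calculated >= critical else "accept"


calculated_rs = 0.70  # hypothetical value calculated from your own data
for level in (0.05, 0.01):
    print(f"{level:.0%} level: {decide(calculated_rs, 10, level)} the null hypothesis")
```

With this made-up value the same data lead to rejecting the null hypothesis at the 5% level but accepting it at the 1% level, which is why you need to state the level you used alongside your result.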
If you are not so concerned with being near certain you can pick
a bigger % significance level. If you picked the 5% level the
critical value would be 0.648. This is smaller than the critical
value for 1% significance, so it will be easier for your value
(calculated from your own data) to beat it and reject the hypothesis
of no difference. However, if you do it at this level you would
expect chance alone to produce a value this big 5% of the time,
even when the null hypothesis is true. In other words, 95 times
out of 100 chance would not lead you to reject a true null
hypothesis; 5 times out of 100 it would.
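One way to see what "due to chance" means is to simulate it. The sketch below shuffles the ranks of 10 pairs of data at random, so that there is no real relationship, and counts how often the calculated rs still reaches the critical value. It assumes the quoted critical values are two-tailed (the size of rs is compared, ignoring its sign), which the text does not state; the function names, trial count and random seed are arbitrary choices of ours.

```python
import random


def spearman_rs_from_ranks(rank_x, rank_y):
    """Spearman's rs computed directly from two lists of untied ranks."""
    n = len(rank_x)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - (6 * sum_d_squared) / (n * (n ** 2 - 1))


def false_alarm_rate(critical_value, n_pairs=10, trials=20000, seed=1):
    """Fraction of trials in which unrelated data still reach the critical value."""
    rng = random.Random(seed)
    rank_x = list(range(1, n_pairs + 1))
    hits = 0
    for _ in range(trials):
        rank_y = rank_x[:]
        rng.shuffle(rank_y)  # random order: the null hypothesis is true by construction
        # Compare the size of rs, ignoring its sign (two-tailed assumption).
        if abs(spearman_rs_from_ranks(rank_x, rank_y)) >= critical_value:
            hits += 1
    return hits / trials


print(f"5% level (critical value 0.648): {false_alarm_rate(0.648):.3f}")
print(f"1% level (critical value 0.794): {false_alarm_rate(0.794):.3f}")
```

Under these assumptions the two fractions should come out close to, or a little below, 0.05 and 0.01. That is the trade-off you make when you choose a significance level: the smaller critical value is easier to beat, but chance alone beats it more often.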
There is no law about what level of significance you choose, but given the inherent variability (or cussedness) of biological systems, it has become generally accepted that a level of 5% is acceptable for field data.
Copyright © 2008 Field Studies Council

Creative Commons Attribution-Noncommercial-No Derivative Works 3.0 Licence .